The squabbles and the policy challenges will be more easily resolved if we understand their origin. In addition, we must focus our attention on the problem of institutional capacity and the health of capital resources. In comparison with what is available elsewhere, and what ought to be available to us, our environments are significantly worse than they were a quarter century ago. We owe to the next generation of students and faculty members an opportunity to do science as close to the forefront as all of us have been able to do it. Commitments only to the number of research grants next year, or to the total programmatic support of research in the federal budget, will not make that happen. They will only perpetuate the present liability, extend the divisions between researchers and institutions, and blunt the promise that our extraordinary way of doing science has created.
A wide variety of diseases in many animal species are a consequence of infection by retroviruses (1). A distinct group of human retroviruses has been isolated from patients with the acquired immune deficiency syndrome (AIDS) and individuals with related conditions, such as persistent lymphadenopathy. Several independent isolates, called lymphadenopathy-associated virus or LAV (2), human T-cell lymphotropic virus type III or HTLV-III (3), and AIDS-associated retrovirus or ARV (4) by the laboratories of origin, are similar with respect to morphology, cytopathology, requirements for optimum reverse transcriptase activity, at least some antigenic properties, and some restriction endonuclease cleavage sites in viral DNA. Epidemiological studies show that infection by one of these viruses may be a necessary condition for the development of AIDS, although predisposing factors may contribute to the onset of the disease (3-10).

Molecular clones of HTLV-III, LAV, and ARV-2 have been described (11, 12). These clones provide material for analyses of viral structure, viral replication, and mechanisms of pathogenesis as well as for measurements of similarities and differences among the retroviruses associated with AIDS and with other retroviruses. In this report, the genetic structure of an ARV isolate is established from the sequences of molecular clones of ARV-2 DNA (12) and from the partial sequence of virion proteins.

The DNA sequence of ARV-2. Proviral DNA and circular unintegrated viral DNA species from ARV-2 infected cells have been cloned in bacteriophage λ (12), and the structures of five recombinant phage containing ARV-2 DNA were characterized (Fig. 1). The nucleotide sequence of various regions of each of these molecular clones was determined and used to establish the complete sequence of ARV-2 DNA. The sequence variations in ARV-2 DNA in these phage are presented in Table 1.

Long terminal repeat regions (LTR's).
The LTR's of retroviruses participate in 
the integration of the virus with the host 
cell and in the regulation of transcription 
of viral genes (13-15). To define the 

The authors are at Chiron Research Laboratories, Chiron Corporation, Emeryville, California 94608, except for Jay A. Levy, who is at the Cancer Research Institute, University of California, School of Medicine, San Francisco, California 94143.
precise boundaries of the LTR sequences, we compared the junctions with host-cell DNA in the sequences of λ-9B, λ-7A, λ-8A, and λ-7D (Fig. 1). The LTR of ARV-2 is 636 bp and is bounded by an inverted repeat of 3 bp (CTG) (Fig. 2). The sizes of the inverted repeat at the ends of the LTRs of the other human retroviruses, HTLV-I and HTLV-II, are 2 bp (16, 17). Integration of proviruses did not occur in a specific site in the host cell genome since adjacent cell DNA sequences in λ-9B, λ-8A, and λ-7D were unique (data not shown). Preceding the rightward (3') LTR is a polypurine tract of 16 bp beginning at position 8632 (Fig. 2). Polypurine tracts are similarly positioned in other retroviruses and play an important role in the initiation of plus-strand DNA synthesis (15). Immediately downstream from the leftward (5') LTR is a sequence of 18 bp that is complementary to 18 bases of a transfer RNA-lysine (tRNA^lys) species (Fig. 2). Initiation of minus-strand DNA synthesis in retroviruses requires a host cell tRNA molecule as a primer (15). MMTV (mouse mammary tumor virus) also requires a tRNA^lys molecule (18), whereas other known mammalian retroviruses including HTLV-I and HTLV-II have a tRNA-proline primer (16, 17, 19).
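The two LTR landmarks described above, a short terminal inverted repeat and a polypurine tract, are simple to check computationally. The following Python sketch is illustrative only: the function names and the toy sequences are assumptions made for this example and are not taken from the ARV-2 sequence in Fig. 2.

```python
# Minimal sketch: check for a terminal inverted repeat and find the longest
# polypurine run. Sequences used with it here are toy examples (assumptions).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def has_inverted_repeat(ltr, n):
    """True if the last n bases are the reverse complement of the first n."""
    ltr = ltr.upper()
    return len(ltr) >= 2 * n and ltr[-n:] == reverse_complement(ltr[:n])


def longest_polypurine_run(seq):
    """Return (1-based start, length) of the longest run of A or G."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, base in enumerate(seq.upper()):
        if base in "AG":
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start + 1, best_len


# Toy LTR-like string that starts with CTG and ends with CAG.
toy_ltr = "CTGGAAGGGCTAATTCACTCCAG"
print(has_inverted_repeat(toy_ltr, 3))
print(longest_polypurine_run("TTAAAAGAAAAGGGGGGACTGG"))
```

A 3-bp inverted repeat of CTG means the element begins with CTG and ends with CAG on the same strand, which is the condition `has_inverted_repeat` tests.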

Contained within the LTR's of retroviruses are signals that control initiation and processing of viral transcripts (13-15). The cap site and a portion of the leader sequence are specified by the LTR. A primer-extension experiment in which we used purified virion RNA identified the 5'-end of ARV-2 RNA (Fig. 3). Thus, the ARV-2 LTR (R-U5 region) contributes 182 bp to the leader (Fig. 2). Many genes of eukaryotic cells and viruses contain a TATA box about 25 bp upstream from the start of the transcript (20); the TATA box is important for positioning the start site of transcription (20, 21). In the ARV-2 LTR sequence, a TATA box is located at -29 to -25. A 13-bp palindrome, at -25 to -13, overlaps the 3'-end of the ARV-2 TATA box; the significance of this structural feature is not known. Another common element of eukaryotic transcriptional units, a CAAT box, is usually positioned 60 to 70 bp upstream from the cap site (22). A similar feature is not present in the ARV-2 LTR.
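Promoter elements are reported here relative to the cap site, which is numbered +1, with upstream bases numbered -1, -2, and so on (there is no position 0). The sketch below shows one way to locate a motif and express its span in that coordinate system. The motif string TATAA and the toy sequence are assumptions chosen so that the hit falls at -29 to -25; they are not copied from Fig. 2.

```python
# Minimal sketch: report motif hits in cap-site-relative coordinates.
# Convention assumed here: the cap nucleotide is +1 and there is no position 0.


def relative_position(index, cap_index):
    """Convert a 0-based string index to a cap-relative coordinate."""
    offset = index - cap_index
    return offset + 1 if offset >= 0 else offset


def find_motif(seq, motif, cap_index):
    """Return (first, last) cap-relative coordinates of every motif hit."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        end = start + len(motif) - 1
        hits.append((relative_position(start, cap_index),
                     relative_position(end, cap_index)))
        start = seq.find(motif, start + 1)
    return hits


# Toy promoter: 10 filler bases, a TATAA motif, 24 filler bases, then the cap.
toy = "G" * 10 + "TATAA" + "C" * 24 + "GTTT"
cap_index = 39  # 0-based index of the cap nucleotide in the toy string
print(find_motif(toy, "TATAA", cap_index))
```

The same function can be used to confirm the absence of an element, for example a CAAT motif in the window 60 to 70 bp upstream, by checking that no hit falls in that range.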

A consensus sequence that signals addition of polyadenylated tails, AATAAA (23), is located in the rightward ARV-2 LTR at position 9174 to 9179 (Fig. 2). Further downstream in the LTR, between 9203 and 9224, is a region that is devoid of A residues (Fig. 2). The site of addition of polyadenylated [poly(A)]
Table 1. Polymorphism of the λ recombinants shown in Fig. 1.
*Numbering system as described in Fig. 2.

tails in the LTR's of many retroviruses is followed by a region of 20 to 30 bp that is also deficient in adenylic acid residues (19). For several eukaryotic genes and retroviruses, including MuLV (murine leukemia virus), MMTV, RSV (Rous sarcoma virus), and RAV-0 (Rous-associated virus), the dinucleotide CA is located at the poly(A) addition site (19). These comparisons were used to propose a tentative poly(A) addition site at position 9198 in the rightward ARV-2 LTR (Fig. 2).
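The reasoning used to propose the poly(A) addition site (an AATAAA signal, followed a short distance downstream by a CA dinucleotide) can be sketched as a small search routine. The window of 10 to 30 nucleotides downstream of the signal is an assumption made for this illustration, as are the function name and the toy sequence; the paper itself places the signal at 9174 to 9179 and the tentative site at 9198.

```python
# Minimal sketch: find AATAAA signals and candidate CA addition sites
# in a downstream window. Window bounds are illustrative assumptions.


def polya_candidates(seq, signal="AATAAA", min_gap=10, max_gap=30):
    """Return a list of (signal start, [candidate site positions]).

    All positions are 1-based. A candidate site is the A of a CA
    dinucleotide whose C lies between min_gap and max_gap bases
    past the end of the signal.
    """
    seq = seq.upper()
    results = []
    start = seq.find(signal)
    while start != -1:
        signal_end = start + len(signal)  # 0-based index just past the signal
        lo = signal_end + min_gap
        hi = min(signal_end + max_gap, len(seq) - 1)
        sites = [i + 2 for i in range(lo, hi) if seq[i:i + 2] == "CA"]
        results.append((start + 1, sites))
        start = seq.find(signal, start + 1)
    return results


# Toy sequence: signal at positions 3-8, a CA dinucleotide 18 bases later.
toy = "GG" + "AATAAA" + "T" * 18 + "CA" + "GTGTGTGT"
print(polya_candidates(toy))
```

A fuller version would also score the A content of the 20 to 30 bases after each candidate, since the text notes that the addition site is followed by an A-deficient region.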
The enhancer element, generally located upstream from the TATA box, has been shown to be an important feature of transcriptional regulation for some eukaryotic genes and viruses (24-28). Large repeats, characteristic of some retroviral enhancers, are not present in the ARV-2 LTR. A close fit for the proposed consensus sequence for enhancer elements, (G)TGGttt(G) (29), is not found in the ARV-2 LTR.

The gag gene. The gag region of retroviruses encodes the internal structural proteins of the virion (30). A precursor polypeptide is synthesized and subsequently cleaved to yield mature gag proteins (30). The DNA sequence of ARV-2 predicts a gag precursor of 506 codons initiating at the ATG at position 337, the first ATG in the proposed full-length ARV-2 RNA (Fig. 2). To verify the use of this reading frame and to identify virion proteins as products of gag, we determined partial amino acid sequences of two virion proteins, p25 and p16, detected with serum from an AIDS patient (Fig. 4) but not with normal human



Abstract. The nucleotide sequence of molecular clones of DNA from a retrovirus, ARV-2, associated with the acquired immune deficiency syndrome (AIDS) was determined. Proviral DNA of ARV-2 (9737 base pairs) has long terminal repeat structures (636 base pairs) and long open reading frames encoding gag (506 codons), pol (1003 codons), and env (863 codons) genes. Two additional open reading frames were identified. Significant amino acid homology with several other retroviruses was noted in the predicted product of gag and pol, but ARV-2 was as closely related to murine and avian retroviruses as it was to human T-cell leukemia viruses (HTLV-I and HTLV-II). By means of an SV-40 vector in transfected simian cells, the cloned gag and env genes of ARV-2 were shown to express viral proteins.
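Open reading frames such as those reported for gag, pol, and env are found by scanning each reading frame for an ATG followed by an uninterrupted run of sense codons up to a stop codon. The sketch below is a minimal forward-strand scan; the function name, the `min_codons` parameter, and the toy sequence are assumptions for illustration, and lengths are counted in codons excluding the stop.

```python
# Minimal sketch: list ATG-initiated open reading frames on the forward strand.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_reading_frames(seq, min_codons=1):
    """Return (1-based start, codon count) for each ATG...stop ORF.

    The codon count includes the initiating ATG and excludes the stop.
    ORFs that run off the end of the sequence are not reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start + 1, n_codons))
                start = None
    return orfs


# Toy sequence with one complete ORF (ATG GCT AAA TGA) starting at position 3.
print(open_reading_frames("CCATGGCTAAATGATTT"))
```

Raising `min_codons` filters out short spurious frames, which is how long frames of several hundred codons stand out from background.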
Fig. 1. Restriction endonuclease map of ARV-2. Five recombinant λ clones were isolated (12) and used to determine the nucleotide sequence of ARV. Clones 8A, 8B, 7D, and 9B represent integrated DNA. Clone 7A is from unintegrated DNA (12). The heavy lines indicate regions that were sequenced in each clone. The regions that encompass the gag, pol, and env ORF's as well as two additional open reading frames are indicated.
[Fig. 2 sequence listing: nucleotide sequence of ARV-2 DNA, with the predicted amino acid sequences shown above the nucleotide lines.]
Fig. 2 (pages 486 and 487). Nucleotide sequence of ARV-2 DNA. The predicted amino acid sequences of the products of the gag, pol, and env genes are indicated. The U3, R, and U5 regions of the LTR's are also designated. The cap site, as determined from the experiment shown in Fig. 3, is position +1. A 3-bp inverted repeat at the ends of the LTR, the TATA box at position -29, the sequence complementary to the 3'-end of the tRNA^lys at position 183, and the polyadenylation signal at position 9174 are underlined. The overlines indicate the amino acid sequences determined from virion proteins (Fig. 4). The nucleotides at the beginning of each line are numbered and the amino acids at the end of each line are indicated. Methods. Restriction enzyme DNA fragments of recombinant phage DNA (Fig. 1) were isolated after electrophoresis in polyacrylamide or agarose gels, cloned into M13 vectors, and used as templates for DNA sequencing by the dideoxy chain termination method (50). Oligonucleotide primers for sequencing were chemically synthesized by solid-phase phosphoramidite chemistry on an Applied Biosystems machine. The limits of the LTR's were established by the sequence of both ends of proviral DNA as well as the sequence of a permuted clone (λ-7A in Fig. 1). For protein sequencing, 0.38 mg of purified virus was subjected to electrophoresis on a 12 percent polyacrylamide Laemmli gel and the bands corresponding to p16gag and p25gag were cut out and electroeluted by the method of Hunkapiller et al. (51). NH2-terminal sequencing of these proteins was carried out as described by Hunkapiller et al. (52). COOH-terminal analysis was by the carboxypeptidase digestion procedures of Hayashi (53). The compiled ARV-2 DNA sequence, including both copies of the LTR, is 9737 bp in length. The analysis of the genetic organization of ARV-2 draws on comparisons with other retroviruses. For these comparisons we used computer programs such as MALIGN to identify homologous regions among DNA sequences and protein sequences. Structural relations were also investigated. Predicted proteins from ARV-2 open reading frames were analyzed for hydropathy patterns by the method of Hopp and Woods (54) and for specific structural features by a modification of the method of Chou and Fasman (55). These two parameters were combined to determine regions of a protein that may be on the surface, particularly loops composed of hydrophilic residues.
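The hydropathy analysis mentioned in the legend can be illustrated with a sliding-window average over the Hopp and Woods hydrophilicity scale. The sketch below uses the published scale values and a six-residue window; the function names and the toy peptide are assumptions for this example, and it does not reproduce the authors' combination with the modified Chou and Fasman predictions.

```python
# Minimal sketch: Hopp-Woods hydrophilicity profile with a sliding window.

HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def hydrophilicity_profile(protein, window=6):
    """Return the mean hydrophilicity of each window along the protein."""
    values = [HOPP_WOODS[aa] for aa in protein.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def most_hydrophilic_window(protein, window=6):
    """Return (1-based start, mean score) of the highest-scoring window.

    Raises ValueError if the protein is shorter than the window.
    """
    profile = hydrophilicity_profile(protein, window)
    best = max(range(len(profile)), key=profile.__getitem__)
    return best + 1, profile[best]


# Toy peptide: a charged stretch (KRDEKR) flanked by leucines.
print(most_hydrophilic_window("LLLLKRDEKRLLLL"))
```

High-scoring windows mark stretches that are likely to be exposed on the protein surface, which is the property the legend describes for hydrophilic loops.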
Fig. 3 (left). Identification of the 5'-end of ARV-2 RNA. Viral RNA was isolated from virions (12) and used as a template for Klenow fragment of DNA polymerase I with the synthetic oligonucleotide 5'GGGCACACACTACTTGAAGC as a primer. An M13 clone containing the leftward LTR of ARV-2 was also primed with the same oligonucleotide in the presence of dideoxynucleotides (50). Both reactions were resolved on a sequencing gel. Lane 1 corresponds to the primer extension reaction with ARV-2 RNA template. Lanes 2, 3, 4, and 5 correspond to C, T, A, and G, respectively, of the sequencing reactions of the M13 recombinant clone.
Fig. 4 (right). Polypeptides of purified virus. Gradient-purified ARV-2 (5 μg per lane) was subjected to electrophoresis on a 12 percent polyacrylamide gel according to the method of Laemmli (56). Lane A, staining with Coomassie brilliant blue. Lane B (immunoblot), polypeptides transferred to nitrocellulose (57) and treated first with a 1:500 dilution of serum from an AIDS patient (EW5111 reference serum from P. Feorino, Centers for Disease Control, Atlanta, Georgia) and then with a 1:200 dilution of horseradish peroxidase-conjugated goat antiserum to human immunoglobulin G (Cappel Laboratories, No. 3201-0081). The color was developed with Color Development Reagent (containing 4-chloro-1-naphthol; Bio-Rad). The molecular weights of protein markers subjected to electrophoresis in parallel lanes are shown on the left. P25 and p16 indicate the bands that correspond to p25gag and p16gag that were used as substrates for amino acid sequencing.



control serum (data not shown). Virion proteins were isolated from a polyacrylamide gel and the first 30 amino acids at the NH2-terminus of p16 and the first 20 of p25 were determined by gas-phase microsequencing. Alignment with the DNA sequence (Fig. 2) suggests that the first gag polypeptide is 134 amino acids in length and may correspond to a pXlgag virion protein species seen on polyacrylamide gels (unpublished results). The NH2-terminus of p25 is generated by a cleavage between Tyr-138 and Pro-139 (Fig. 2). Proline is present at the NH2-terminus of at least three other major retroviral gag proteins (p24gag of HTLV-I, p27gag of RSV, and p30gag of MuLV) (16, 19). A protease with this cleavage specificity has not yet been identified in ARV-2, but this activity can be encoded by a retrovirus (30). The carboxyl terminus of p25gag was determined by digestion with carboxypeptidase and yielded the sequence Arg-Val-Leu (amino acids 367, 368, and 369, respectively). The NH2-terminus of p16 is generated by cleavage between Met-383 and Met-384 (Fig. 2). Processing at this site may involve chymotrypsin or a chymotrypsin-like enzyme, which is believed to process part of the gag precursor polypeptide in other retroviruses (30). The COOH-terminus of p16 probably occurs at Gln-506 since a translational stop codon follows (Fig. 2), although further proteolytic processing could also be involved.
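The processing sites given above fix the lengths of the mature gag products by simple arithmetic. The sketch below computes them, taking p25 to run from Pro-139 to the Leu of the Arg-Val-Leu terminus (residue 369, the residue after 368) and p16 to run from Met-384 to Gln-506. The average residue mass of 110 daltons is a common rule of thumb and is an assumption here, so the mass estimates are rough and need not match gel mobility.

```python
# Minimal sketch: lengths and rough masses of gag products from the
# cleavage positions quoted in the text.

AVERAGE_RESIDUE_DALTONS = 110  # rule-of-thumb value, an assumption


def peptide_length(first, last):
    """Number of residues from first to last, inclusive."""
    return last - first + 1


# (first residue, last residue) in gag precursor numbering.
gag_products = {"p25": (139, 369), "p16": (384, 506)}

lengths = {name: peptide_length(first, last)
           for name, (first, last) in gag_products.items()}
approx_kd = {name: n * AVERAGE_RESIDUE_DALTONS / 1000
             for name, n in lengths.items()}

for name in gag_products:
    print(name, lengths[name], "residues, about", approx_kd[name], "kD")
```

The p25 estimate of roughly 25 kD agrees with its name; apparent sizes on gels can differ from such estimates, particularly for basic proteins.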

A small amount of amino acid sequence homology is noted when p25gag of ARV-2 is compared to p24gag of HTLV-I (16) (data not shown). This homology involves the position of two cysteine (C) residues relative to the COOH-terminal of both proteins (Fig. 2). Also, four of five amino acids at the COOH-terminus of p25gag of ARV-2 match those at the COOH-terminus of p24gag of HTLV-I (Fig. 2) (16). A preponderance of hydrophobic residues characterizes these proteins.

Sequence comparisons of p16gag of ARV-2 with p16gag of HTLV-I (16), p12gag of RSV (19), and p15gag of MuLV (19) reveal the best homology (Fig. 5). The relative positions of the five Cys residues in each of these three proteins are closely conserved and all three contain a high proportion of hydrophilic residues.

The pol gene. The pol region encodes the virion RNA-dependent DNA polymerase (reverse transcriptase). Several additional enzymatic functions related to replication are controlled by this region, including ribonuclease H, a DNA endonuclease, and, in some retroviruses, a protease (15, 30). An open reading frame of 1003 codons appears to be the ARV-2 pol domain (Fig. 2). Some homology at the protein level is observed in the NH2-terminal portions of the predicted pol genes of ARV-2, HTLV-I (16, 31), RSV (19), and MuLV (16) (Fig. 6). This region is also homologous to portions of the putative viral polymerases of hepatitis B viruses and cauliflower mosaic virus (31). Analysis of the remainder of the pol genes of ARV-2, HTLV-I, RSV, and MuLV demonstrates appreciable homology in protein structure and sequence near the COOH-termini (16, 19, 32, 33) (Fig. 7). A 32-kD polypeptide is produced by proteolytic processing near the COOH-terminus of the RSV pol polypeptide precursor (33). Alignments of shared amino acids in this region of the ARV-2 pol gene (in particular, Cys residues) with the defined NH2-terminus of p32 of RSV (33) permit tentative identification of a processing site for the counterpart protein (Fig. 7).
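Homology between aligned protein segments, as in the pol comparisons above, is commonly summarized by counting identical residues at aligned positions. The sketch below does this for two pre-aligned strings with `-` as the gap character. The function name and the two short toy strings are assumptions for illustration; they are not the alignment of Fig. 6 or Fig. 7.

```python
# Minimal sketch: count identical residues between two pre-aligned sequences.


def identity(aligned_a, aligned_b, gap="-"):
    """Return (identical positions, compared positions).

    Columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches, compared


# Toy aligned segments (hypothetical).
matches, compared = identity("GIDKAQ-EH", "GLDPA-ELH")
print(matches, "identical of", compared, "compared positions")
```

Reporting both counts, and not only a percentage, makes clear how many columns the comparison rests on.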

The env gene. The env region encodes the major glycoprotein found in the membrane envelope of the virus and in the cytoplasmic membrane of infected cells (30). Retroviral env proteins arise generally from a precursor polypeptide that is processed at two or more sites: the first processing event removes a signal peptide of about 30 amino acids and the second yields a COOH-terminal polypeptide containing a hydrophobic stretch (about 22 amino acids) that spans the membrane and is followed by a hydrophilic cytoplasmic anchor (30). Results of transient expression experiments in mammalian cells (see Fig. 8) indicate that serologically reactive ARV-2 env protein is initiated downstream from the Sst I site at position 5555 to 5560 (Fig. 2). We propose that the ATG at position 5779 (34) initiates the env precursor, but direct determination of the NH2-termini of the env precursor polypeptide and of processed forms will ultimately be required to establish the biogenesis of env proteins. Two other potential initiation codons are near the 5'-end of the same long open reading frame (863 codons) proposed to encode the ARV-2 env protein (positions 5845 and 5851, Fig. 2).
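Whether alternative ATG codons lie in the same reading frame as a proposed initiator follows from their spacing: two positions share a frame when they differ by a multiple of three. The sketch below checks this for the positions quoted above and also lists in-frame ATG codons in a sequence. The function names and the toy sequence are assumptions for illustration.

```python
# Minimal sketch: test frame sharing between nucleotide positions and
# list in-frame ATG codons downstream of an initiator.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_offset(start, position):
    """Codons between two 1-based positions, or None if frames differ."""
    delta = position - start
    return delta // 3 if delta % 3 == 0 else None


def in_frame_atgs(seq, first_atg_index, limit_codons=30):
    """Return codon indices (0 = initiator) of in-frame ATGs before a stop."""
    seq = seq.upper()
    hits = []
    for k in range(limit_codons):
        i = first_atg_index + 3 * k
        codon = seq[i:i + 3]
        if len(codon) < 3 or codon in STOP_CODONS:
            break
        if codon == "ATG":
            hits.append(k)
    return hits


# Positions quoted in the text for the env reading frame.
print(codon_offset(5779, 5845), codon_offset(5779, 5851))

# Toy sequence: ATG GCT ATG AAA TGA
print(in_frame_atgs("ATGGCTATGAAATGA", 0))
```

By this arithmetic the ATGs at 5845 and 5851 fall 22 and 24 codons downstream of the one at 5779, consistent with all three lying in one open reading frame.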

Secondary structure analysis shows that the COOH-terminal region is
organized into predominantly α-helices and β-sheets; the
NH2-terminal half appears to have many hydrophilic loop regions
(see legend to Fig. 2); similar structural properties characterize
the domains of env gene products of other retroviruses (data not
shown). A tentative assignment of a processing site for ARV-2 env
includes the sequences Lys-Arg-Arg or Lys-Arg (Fig. 9). Which of
these sites is used remains to be determined. Processing in this
region will generate final products of 59 and 42 kD without
accounting for carbohydrate residues. The NH2-terminal and
COOH-terminal portions contain, respectively, 26 and 5 potential
N-linked glycosylation sites (Asn-X-Thr, Asn-X-Ser) (Figs. 2 and
9). Cysteine residues are asymmetrically distributed as in other
retroviral env gene products (19). The NH2-terminal domain has



Fig. 7. Homology of amino acids in the COOH-terminal portion of the
pol genes of ARV-2, HTLV-I, RSV, and MuLV. Identical amino acid
residues are underlined. Positions of cysteines are noted with
asterisks. ARV-2: 719 to 878 (Fig. 2). HTLV-I: amino acid 599 to
766 (10). RSV: amino acid 568 to 743 (16). MuLV: amino acid 846 to
1019 (16). Numbers indicate amino acid positions (Fig. 2).

Fig. 5 (left). Homology of amino acids in regions of the gag gene
of ARV-2, HTLV-I, RSV, and MuLV. Identical amino acid residues are
underlined. Positions of cysteines are noted with asterisks. ARV-2:
p16gag, amino acid 14 to 51 (Fig. 2). HTLV-I: p12gag, amino acid 12
to 50 (10). RSV: p12gag, amino acid 20 to 61 (16). MuLV: p10gag,
amino acid 25 to 60 (16). Numbers indicate amino acid positions
(Fig. 2).

Fig. 6 (right). Homology of amino acids in the NH2-terminal portion
of the pol genes of ARV-2, HTLV-I, RSV, and MuLV. Identical amino
acid residues are underlined. ARV-2: amino acid 262 to 352 (Fig.
2). HTLV-I: amino acid 110 to 196. RSV: amino acid 113 to \11 (16).
MuLV: amino acid 265 to 351 (16). Numbers indicate amino acid
positions (Fig. 2).

Table 2. Summary of homologies of ARV-2 with other retroviruses. Homologies are given as
percentages from the MALIGN program.

          ARV-2 gag            ARV-2 pol            ARV-2 pol
          (amino acid 393      (amino acid 303      (amino acid 719
          to 429, Fig. 5)      to 390, Fig. 6)      to 878, Fig. 7)

Virus     Amino    Nucleo-     Amino    Nucleo-     Amino    Nucleo-
          acid     tide        acid     tide        acid     tide

HTLV-I    46       7           42       1 '         28       19
RSV       39       7           49       15          28       12
MuLV      27       10          35       15          12       23


18 Cys residues and the COOH-terminal 
portion has 3 Cys residues. Two large 
hydrophobic regions are evident in the 
COOH-terminal domain (Fig. 9). The 
rightward hydrophobic stretch is long 
enough (23 amino acids) to span mem- 
branes. 
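
The membrane-spanning argument above rests on finding a run of about 23 hydrophobic residues. A minimal sketch of such a scan follows, assuming the Kyte-Doolittle hydropathy scale and a mean-hydropathy threshold of 1.6; neither is stated in the text, and the toy sequence is hypothetical rather than the ARV-2 env sequence.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the paper names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(protein, window=23, threshold=1.6):
    """Return (start, mean_hydropathy) for each window whose mean
    hydropathy is at least `threshold`. Start positions are 1-based."""
    hits = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        mean = sum(KD[aa] for aa in segment) / window
        if mean >= threshold:
            hits.append((i + 1, round(mean, 2)))
    return hits


# Hypothetical toy protein: hydrophilic flank, 23-residue hydrophobic
# core, charged tail. It is NOT taken from the ARV-2 sequence.
toy = "KDEERS" + "LIVALLGAVILFLAVGIALLIVA" + "RRKKDE"
print(hydrophobic_windows(toy))
```

Windows that pass the threshold cluster around the hydrophobic core, which is the kind of signal interpreted in the text as a candidate membrane-spanning stretch.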

Expression of cloned ARV genes. In an attempt to obtain ARV
antigens without the production of infectious virus, an SV40 vector
system was used to express the candidate gag and env genes in
transfected mammalian cells. The criterion for expression was
serological reactivity of fixed cells with serum from AIDS patients
in immunofluorescence tests. Recombinant SV40 plasmids containing
these genes were transfected into 5 x 10^4 COS-7 monkey cells
growing on microscope slides (Fig. 8); after 60 hours, cell
monolayers were fixed and treated with AIDS patients' sera or
normal human control sera and then with fluorescein-labeled goat
antiserum to human immunoglobulin G (Fig. 8). Approximately 5
percent of cells transfected with pSV7c/gag showed a speckled
pattern of immunofluorescence throughout the cytoplasm with AIDS
patient serum EW5111 (Fig. 8A). Antiserum MC from a patient in the
early stage of AIDS appeared not to react with cells transfected
with pSV7c/gag (data not shown). By immunoblot analysis with
proteins from purified ARV-2, antiserum MC was shown to have very
low levels of antibody to p25gag, whereas antiserum EW5111 readily
reacted to p25gag. Serum from normal individuals gave no
appreciable fluorescence (data not shown) in cells transfected with
pSV7c/gag; cells transfected with the vector



Fig. 8. Expression of cloned ARV genes in mammalian cells. ARV-2 DNA fragments containing
the gag and env genes were prepared as follows: λ-7A DNA (Fig. 1) was digested with Sst I and
Kpn I and the 3.1-kb gag DNA fragment was purified by electrophoresis in low-melting agarose
gels (7); λ-7D DNA (Fig. 1) was digested with Sst I and Kpn I and the 3.2-kb env DNA fragment
was similarly purified. Each of these fragments was cloned into a modified form of a plasmid
containing the SV40 origin of DNA synthesis and the promoter and poly(A) addition regions of
the SV40 early gene (58, 59). Both ARV gag and env DNA fragments contain ATG start codons.
pSV7c/gag utilized a TAA stop codon in SV40 DNA. pSV7c/env has the TAA stop codon at the
end of the open reading frame for env (Fig. 2). COS-7 monkey cells, expressing the SV40 early
gene, were grown on glass microscope slides, transfected with plasmid DNA by the calcium
phosphate coprecipitation method (60), incubated for 60 hours, and fixed in cold acetone. The
fixed cell monolayers were treated for 1 hour at 37°C with a 1:200 dilution (in PBS with 5
percent fetal calf serum) of an AIDS reference serum (Fig. 4) or with a similar dilution of normal
human control serum. Cells were washed in PBS and treated for 1 hour at 37°C with fluorescein-
labeled goat antiserum to human immunoglobulin G (Cappel Laboratories). In all cases, sera
were preadsorbed on normal COS-7 cells that had been fixed with 0.2 percent paraformaldehyde.
Shown here are fluorescence photomicrographs (I > M0) of cells transfected with (A) pSV7c/gag
and (B) pSV7c/env. About 5 percent of cells in a monolayer expressed viral antigens.

plasmid pSV7c, containing no ARV DNA, did not fluoresce with any
serum samples from AIDS patients. About 5 percent of cells (from
5 x 10^4 cells per microscope slide) transfected with pSV7c/env and
treated with either EW5111 or MC antiserum showed bright
immunofluorescence largely confined to the cytoplasm in a netlike
pattern (Fig. 8B). These patterns may be a consequence of the
fixation procedure or may indicate that viral env protein is
localized in structures such as endoplasmic reticulum inside the
cell. No fluorescence was observed in cells transfected with
pSV7c/env and treated with normal human control sera (data not
shown).

Discussion. The complete DNA se- 
quence of ARV-2 reveals a fundamental 
genetic structure similar to that of other 
retroviruses. Several features of ARV-2 
indicate that it is no more closely related 
to the other human retroviruses HTLV-I 
and HTLV-II than it is to avian or mu- 
rine retroviruses. 

ARV-2 has an inverted 3-bp repeat (CTG . . . CAG) at the ends of
the LTR. All other retrovirus LTR's have TG . . . CA at their ends
as part of a 2- to 16-bp inverted repeat (14). The MuLV LTR has two
direct repeats 72 bp long located in an internal position within
the LTR (19). HTLV-II has several direct repeats, one of which is
21 bp long and is very similar to a 21-bp repeat in HTLV-I (16,
17). RSV, however, is like ARV-2 and has no large direct repeats in
its LTR (19). In the ARV-2 LTR, the proposed poly(A) addition site
is 20 bp downstream from the consensus poly(A) addition signal,
AATAAA (Fig. 2); thus, the R region is 97 bp long [measured from
the cap site to the poly(A) site]. The poly(A) addition sites of
MuLV and RSV are about 20 bp downstream from the AATAAA found in
each LTR; these viruses have R regions of 68 bp and 21 bp,
respectively (19). In contrast, in HTLV-I and HTLV-II, the AATAAA
sequence is located upstream from the TATA box; R is 229 bp in
HTLV-I and 287 bp in HTLV-II (16, 17). ARV-2 and MMTV have a
tRNA^Lys for priming minus-strand DNA synthesis (Fig. 2) {IS);
avian retroviruses use tRNA^Trp and other mammalian retroviruses
use tRNA^Pro (19).
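
The LTR features compared above (terminal inverted repeats and the position of the AATAAA signal) can be located mechanically. The sketch below is illustrative only: the function names are hypothetical, and the toy LTR is an invented sequence that merely carries CTG . . . CAG termini, not the ARV-2 LTR.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def terminal_inverted_repeat(ltr):
    """Length of the perfect inverted repeat shared by the two ends."""
    rc = reverse_complement(ltr)
    n = 0
    while n < len(ltr) // 2 and ltr[n] == rc[n]:
        n += 1
    return n


def polya_signal_offsets(ltr, signal="AATAAA"):
    """0-based start positions of every occurrence of the signal."""
    hits = []
    pos = ltr.find(signal)
    while pos != -1:
        hits.append(pos)
        pos = ltr.find(signal, pos + 1)
    return hits


# Hypothetical toy LTR with CTG ... CAG termini (not real viral sequence).
toy_ltr = "CTGGAAGGGCTTATAAGCAGCAATAAAGCTTGCCTTGAGTCAG"
print(terminal_inverted_repeat(toy_ltr), polya_signal_offsets(toy_ltr))
```

Given a cap site coordinate and a poly(A) addition coordinate from the annotated sequence, the R region length is simply the distance between them; the sketch only finds the signal, because the addition site itself is assigned in the text by proposal rather than by rule.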

The gag regions of MuLV and RSV 
encode precursor polypeptides that are 
cleaved into at least four and five pro- 
teins, respectively (30). Both ARV-2 and 
HTLV-I encode a gag precursor that 
appears to give rise to three proteins 
(Fig. 2) (16). A small amount of homolo- 
gy of amino acid sequences was noted in 
the COOH-terminal portion of gag in 
these viruses; ARV-2, HTLV-I, and

IAPTXAKRRVVQREKRAVGIVGAMF
502                   526

Fig. 9. Schematic diagram of ARV-2 env open reading frame. Numbers refer to amino acids in
the open reading frame proposed for env (nucleotides 5755 to 8346, Fig. 2). Symbols: A,
cysteine residues; ▼, potential N-glycosylation sites; ♦, hydrophobic regions. The two putative
processing sites for generating NH2- and COOH-terminal domains are underlined.



RSV were found to be similarly related 
in this assessment (Fig. 5 and Table 2). 

Different retroviruses use different 
mechanisms to synthesize and translate 
the pol gene messenger RNA (16). Eluci- 
dation of pol biogenesis in ARV-2 will 
require detailed analyses of splicing pat- 
terns of viral mRNA in infected cells 
together with studies of the polypeptide 
intermediates. ARV may be different 
from all other retroviruses since the 
COOH-terminal end of the proposed pol 
gene does not overlap the NH2-terminal 
end of the proposed env gene. 



The predicted ARV-2 env polypeptide, like that of other
retroviruses, has a hydrophilic NH2-terminal domain and a
COOH-terminal portion characterized by a long stretch of
hydrophobic amino acids (23 amino acids long) (Fig. 9). The
NH2-terminal domain of ARV-2 env contains 26 potential
glycosylation sites, an unusually high number when compared to
other retroviruses: HTLV-I has 5 (17), HTLV-II has 6 (35), RSV has
17 (24), and MuLV has 7 (24). The extent and function of
glycosylation in retroviral env proteins remain to be investigated.
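
Counts of potential N-linked glycosylation sites such as those compared above come from scanning a protein for the tripeptide Asn-X-Ser or Asn-X-Thr. The following is a minimal sketch of that scan; excluding proline at the X position is a common convention assumed here, not something the text states, and the toy peptide is hypothetical.

```python
import re


def sequon_positions(protein, exclude_proline=True):
    """1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    A lookahead is used so that overlapping motifs are all reported.
    """
    x = "[^P]" if exclude_proline else "."
    pattern = re.compile(rf"N(?={x}[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein)]


# Hypothetical toy peptide (not part of the ARV-2 env sequence).
toy_peptide = "MNGTKNPSANNTSQNAS"
print(sequon_positions(toy_peptide))
print(sequon_positions(toy_peptide, exclude_proline=False))
```

Such a scan identifies only potential sites; as the text notes, the extent of actual glycosylation has to be determined experimentally.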



In ARV-2 there are two additional open reading frames designated
ORF-1 and ORF-2 (Fig. 10). Near the 5'-end of each open reading
frame is an ATG that is flanked by purine residues at -3 and +4;
thus, these ATG codons are potential start codons (31). HTLV-I
(16), HTLV-II (17), and BLV (36) contain open reading frames that
initiate beyond env and extend into the rightward LTR; this
location is analogous to that of ORF-2 in ARV-2. Comparisons of
ORF-2 in ARV-2 with counterpart regions in these other retroviruses
revealed no apparent homology at the DNA and protein levels (data
not shown). For HTLV-I and HTLV-II, these regions are expressed as
proteins that are implicated in viral pathogenesis (37, 38).
Assessments of patterns of transcription and polypeptide synthesis
will be essential to determine whether or not these ARV-2 open
reading frames are expressed.
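
The criterion used above for a potential start codon (an ATG with purines at -3 and +4, counting the A of ATG as +1) is simple to express as a scan. This is a sketch under that numbering convention; the function name and the toy sequence are hypothetical.

```python
PURINES = {"A", "G"}


def candidate_start_codons(dna):
    """0-based positions of ATG codons with purines at -3 and +4.

    With the A of ATG at index i, position -3 is index i - 3 and
    position +4 is index i + 3 (there is no position 0).
    """
    hits = []
    pos = dna.find("ATG")
    while pos != -1:
        has_context = pos >= 3 and pos + 3 < len(dna)
        if has_context and dna[pos - 3] in PURINES and dna[pos + 3] in PURINES:
            hits.append(pos)
        pos = dna.find("ATG", pos + 1)
    return hits


# Hypothetical toy DNA: the first ATG has A at -3 and G at +4,
# the second has a pyrimidine (C) at -3 and is rejected.
toy_dna = "CCACCATGGCTTTCTCATGCAA"
print(candidate_start_codons(toy_dna))
```

Passing this filter marks an ATG only as a candidate; whether a frame is actually translated depends on the transcription and polypeptide analyses called for in the text.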

Certain taxonomic issues need to be addressed with respect to the
relationships among the human retroviruses at the nucleotide
sequence level. A probe representative of ARV-2 anneals under high
stringency conditions to restriction enzyme DNA fragments from
cells infected with LAV or with HTLV-III (39). Thus these three
retroviruses are closely related. In addition, we have shown that
the probe to ARV-2 anneals under high stringency conditions to
proviral DNA of two independent isolates, ARV-3 and ARV-4 (12). At
the protein level, very low homology is evident when ARV-2 genes
are compared with those of HTLV-I (Figs. 5 to 7 and Table 2);
homology at the nucleotide level is even lower because of
degeneracy of codons (Table 2). In our assessments, ARV-2 appears
to be no more closely related to these other human retroviruses
than it is to RSV (Table 2). Subhuman primate endogenous viral
sequences (40, 41) are also distantly related to ARV pol (data not
shown). Hybridization and annealing studies under very low
stringency conditions demonstrated detectable homology of HTLV-III
with HTLV-I and HTLV-II (11, 42). Our homology assessments at the
nucleotide level (Table 2) indicate that stable hybrids or duplexes
cannot be formed between ARV-2 DNA and HTLV-I DNA under these
conditions. These issues could be fully resolved by comparing the
DNA sequences of the genomes of retroviruses associated with AIDS
(LAV, HTLV-III, and ARV).
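
The point that nucleotide homology can fall below amino acid homology because of codon degeneracy can be made concrete with a small calculation. The sketch below assumes that homology is scored as simple percent identity over aligned, ungapped positions; how the MALIGN program actually scores alignments is not described in the text. The two DNA strings are hypothetical, and the codon table is deliberately partial (standard genetic code entries for the eight codons used).

```python
# Partial standard genetic code: only the codons used in this example.
CODONS = {
    "CTG": "L", "TTA": "L", "TCT": "S", "AGC": "S",
    "CGT": "R", "AGA": "R", "GGA": "G", "GGC": "G",
}


def translate(dna):
    """Translate DNA in frame 1 using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a, b):
    """Percent of aligned columns with identical symbols.

    Columns containing a gap character '-' in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    same = sum(x == y for x, y in columns)
    return 100.0 * same / len(columns)


# Two hypothetical DNAs that both encode Leu-Ser-Arg-Gly.
dna_a = "CTGTCTCGTGGA"
dna_b = "TTAAGCAGAGGC"
print(percent_identity(translate(dna_a), translate(dna_b)))
print(percent_identity(dna_a, dna_b))
```

Here the encoded peptides are identical while the DNAs agree at only a minority of positions, the same direction of effect reported for the comparisons in Table 2.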

The pathology that attends ARV infec- 
tion is a unique aspect of this retrovirus. 
Selective tropism for human T-helper 
cells, syncytia formation, and cell killing 
are characteristics of ARV infection in 
tissue culture cells (2-4, 43). Attachment 
of virus to cell receptors and fusion of 
membranes are two properties con- 
trolled by the env gene that probably 
play a fundamental role in viral patho- 
genesis. The predicted sequence of 
ARV-2 env will be used to design muta- 
genesis experiments aimed at determin- 
ing the function of env in attachment and 
fusion. LTR's of some avian and mam- 
malian retroviruses have been shown to 
control tissue tropism, leukemogenicity, 
and specific disease patterns (44-48). 
Whether or not the ARV LTR plays a 
role in any of the pathologic manifesta- 
tions associated with ARV infection re- 
mains to be established. 
Sequence variations in ARV may be an important feature of viral
pathogenesis that would enable the virus to evade immune responses.
Many viruses show sequence variation during passage. Infection of
an animal with equine infectious anemia virus (EIAV) leads to
differences in the env protein of progeny virus, probably as a
consequence of immunological selective pressures in the host (49).
Our studies of ARV have demonstrated sequence differences (i) in
separate molecular clones of one ARV-2 isolate (Table 1) and (ii)
in independent ARV isolates (12). Biological activity of cloned
ARV-2 DNA has not yet been assessed by transfection of permissive
cells. The generation of sequence variation in the ARV-2 genome can
be studied by analyzing viruses recovered from different
molecularly cloned ARV-2 DNA's. These approaches could provide
insight into methods by which the viral infection could be
prevented, modified, or eliminated.